(54) Method and apparatus for providing for notification of task termination in an information
handling system



(57) A method and apparatus for ensuring that a
process interacting with a failing process is notified of
the failure of that process. Each process has a unique
process identifier (PID) associated with it. Each process
optionally has an affinity list containing one or more
entries, each of which contains the identifier of a process
that is to be notified when the process fails. A process
updates the affinity list of a target process (either itself
or another process) by calling an affinity service of the
operating system (OS) kernel, specifying the type of
operation (add or delete), the identifier of the target
process, the identifier of the process that is to be notified,
and the type of event that is to be generated for the process
that is to be notified. When a process fails, a process
termination service of the OS kernel examines the affinity
list of the failing process and, for each entry in the
list, generates an event of the specified type for the
process specified as to be notified.
Description

[0001] This invention relates to a method and apparatus
for providing for notification of task termination
and, more particularly, to a method and apparatus for
providing for notification of process termination in a
client/server system.

[0002] Client/server computing systems are well
known in the art. In a client/server system, a client process
(or simply "client") issues a request to a server process
(or simply "server"), either on the same system or
on a different system, to perform a specified service. Upon
receiving the request, the server process performs
the requested service and returns the result in a response
to the client process.

[0003] When creating a client/server application on a
single system, there is frequently a need for a client to
communicate requests to a server and to wait for the
server to respond. Similarly, there can be multiple server
processes that need to communicate with multiple client
processes. If a client is waiting for a response from a
server and the server terminates, the client process may
hang in a wait until a user or operator makes a request
to terminate the client process. Similarly, a server may
be waiting for a response from a client and have the client
terminate. Both client and server can add timer calls
into their logic to cause the wait to time out, but this can
cause unnecessary path length and requires the client
or server application to pick a suitable time period.
[0004] In UNIX®-based systems, there are several
programming constructs that can be used to keep track
of the connection between multiple processes. If an
application uses a fork() or spawn() service to create a
child process, then the two processes are tied together
by the UNIX framework. That is, if the child process
terminates, the parent process is sent a SIGCHLD signal.
If the parent process terminates, the child process is
sent a SIGHUP signal. However, since interacting server
and client processes are usually not bound together
by this parent-child relationship, this mechanism is of
little use as a general notification mechanism in UNIX-based
systems.

[0005] According to one aspect of the invention there
is provided a method of providing for notification of task
termination in an information handling system having a
plurality of interacting tasks, the method comprising the
steps of: defining for each of one or more target tasks
an affinity list containing one or more entries for other
tasks that are to be notified on termination of the target
task; in response to receiving an affinity request specifying
a target task and another task, adding an entry for
the other task to an affinity list defined for the target task;
and in response to detecting a termination of a target
task, notifying each other task contained in the affinity
list defined for the target task.

[0006] According to a second aspect of the invention
there is provided apparatus for providing for the notification
of task termination in an information handling system
having a plurality of interacting tasks, the apparatus
comprising: means for defining for each of one or more
target tasks an affinity list containing one or more entries
for other tasks that are to be notified on termination of
the target task; means responsive to receiving an affinity
request specifying a target task and another task for
adding an entry for the other task to an affinity list
defined for the target task; and means responsive to
detecting a termination of a target task for notifying each
other task contained in the affinity list defined for the
target task.

[0007] According to a third aspect of the invention
there is provided a computer program element comprising
computer program code means executable by the
computer to: define for each of one or more target tasks
an affinity list containing one or more entries for other
tasks that are to be notified on termination of the target
task; in response to receiving an affinity request specifying
a target task and another task, add an entry for the
other task to an affinity list defined for the target task;
and in response to detecting a termination of a target
task, notify each other task contained in the affinity list
defined for the target task.

[0008] Thus a solution to the aforementioned problem
is provided by the pid affinity service of the present
invention, described below. The term pid stands for process
id. Both the server and client processes have unique
PIDs. The pid affinity service is used to create an affinity
or bond between the client and server process, such that
when one of them terminates, a mechanism is provided
to drive a signal to notify the other waiting process.
[0009] As will be described below, each process in the
operating system optionally has a pid affinity list that
identifies processes that wish to be notified (via signal)
when the process terminates. The pid affinity service
provides the mechanism for a client to add its pid to a
server's pid affinity list or for the server to add its pid to
the client's pid affinity list. It is up to an application to
determine which processes use the pid affinity service.
[0010] As an example of the operation of the present
invention, suppose that a client is about to make a request
to a server using a message queue. Prior to placing
the request on the server input queue (by issuing a
msgsnd system call), the client calls the pid affinity service
to add its pid to the pid affinity list of the server. The
client then issues a msgsnd system call to place the request
on the server input queue. The client then issues
a msgrcv system call to wait for a response from the
server. While in this message queue wait, the server
may terminate. If this happens, the kernel will see the
pid affinity list and send a signal to each process
(represented by a pid) that is on the pid affinity list. The signal
will wake up the client process from the msgrcv wait and
allow it to fail the current request and return control to
the calling process.
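
The sequence in this example can be sketched as a small simulation. The sketch below is illustrative only: ToyKernel, pid_affinity_add, terminate, the PID values and the event number are hypothetical names chosen here, and the msgsnd and msgrcv calls are not modelled. It shows only that an entry placed on the server's affinity list before the request is sent leads to the client being signalled when the server terminates.

```python
from collections import defaultdict

SIGNAL_PEER_GONE = 1  # hypothetical event number chosen for this sketch


class ToyKernel:
    """Toy stand-in for the kernel; not a real operating system interface."""

    def __init__(self):
        self.affinity = defaultdict(list)  # target pid -> [(pid to notify, event)]
        self.delivered = []                # (notified pid, event), in delivery order

    def pid_affinity_add(self, target_pid, event_pid, event):
        # Record that event_pid wants `event` when target_pid terminates.
        self.affinity[target_pid].append((event_pid, event))

    def terminate(self, pid):
        # On termination, signal every process on the dying process's list.
        for event_pid, event in self.affinity.pop(pid, []):
            self.delivered.append((event_pid, event))


kernel = ToyKernel()
CLIENT, SERVER = 100, 200
kernel.pid_affinity_add(SERVER, CLIENT, SIGNAL_PEER_GONE)  # done before msgsnd
kernel.terminate(SERVER)                                   # server dies mid-request
print(kernel.delivered)
```

In a real system the delivered signal would interrupt the client's msgrcv wait; here the delivery is merely recorded.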

[0011] The above description of a client/server
communication using message queues is simply one example
of how processes may communicate. They could also
use shared memory, semaphores or any other
communication mechanism. This example also described
the client and server as simple single-threaded processes.
It is possible for a server to be multithreaded and
handling many requests concurrently from multiple clients.
If such a server were to terminate, it would cause
the notification of all the clients in its pid affinity list.
[0012] A preferred embodiment of the invention will
now be described, by way of example only, with reference
to the accompanying drawings in which:

Fig. 1 shows the parameter list passed to the pid
affinity service, a process information block and the
pid affinity list;

Fig. 2 shows the flow and logic for adding another
process PID to its own PID affinity list and the result
of termination of the calling process;

Fig. 3 shows the flow and logic for adding the caller's
PID to the PID affinity list of a target process and
the actions triggered by the termination of that target
process;

Fig. 4 shows the entry to the PID affinity service as
well as the delete entry processing;

Fig. 5 shows the add entry logic of the PID affinity
service.

[0013] Referring first to Figs. 2-3, an embodiment of
the present invention contains a PID affinity service 221
in the kernel address space 204 of a system also having
one or more user address spaces 202 including processes
206 (process A) and 216 (process B). Kernel address
space 204 is part of an operating system (OS) kernel
(not separately shown) running, together with one
or more user programs in user address spaces 202, on
a general-purpose computer having a central processing
unit (CPU), main and secondary storage, and various
peripheral devices that are conventional in the art
and therefore not shown. Although the present invention
is not limited to any particular hardware or software
platform, a preferred embodiment may be implemented as
part of the IBM® OS/390® operating system, running
on an IBM S/390® processor such as an S/390 Parallel
Enterprise Server™ G4 or G5 processor.
[0014] Referring now to Fig. 1, each process in the
system has a process information block (PIB) 114 associated
with it. Each PIB 114 contains a process id (PID)
116 uniquely identifying the process and a pointer 118
to a PID affinity list (PAL) 120, as well as other items that
are not related to the present invention and are therefore
not shown. For each call to the pid affinity service 221
to add a PID to the list 120, an entry 122 is made in the
list. Each entry 122 contains the PID 124 of the process
to be notified of an event and an event type 126, which
could be a signal number.
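
The structures of Fig. 1 can be pictured, by way of a minimal sketch, as two record types. The class and field names below are hypothetical and chosen for illustration; the comments map each field to the reference numerals used above. The pointer 118 is modelled as an optional list that is absent until a first entry is added, which is one possible reading of the description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PalEntry:                # entry 122
    pid: int                   # PID 124 of the process to be notified
    event: int                 # event type 126, for example a signal number


@dataclass
class ProcessInfoBlock:        # PIB 114
    pid: int                   # PID 116 identifying this process
    pal: Optional[List[PalEntry]] = None   # pointer 118 to the PAL 120


server = ProcessInfoBlock(pid=200)
server.pal = [PalEntry(pid=100, event=15)]  # one add call made one entry
print(server)
```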



[0015] A pid affinity parameter list (PL) 100 contains
the parameters specified by an application program as
input to the pid affinity service 221, as well as output
from the pid affinity service 221. These parameters include
a function code 102, a target process parameter
104, an event process parameter 106, an event parameter
108 and a return code 110. Parameters 102-108
are input parameters supplied by the calling application
to the PID affinity service 221, while return code 110 is
an output parameter returned by the PID affinity service
221 to the calling application.
[0016] The function code 102 specifies which pid affinity
service function is requested by the application
program. Supported function codes 102 are adding an
entry 122 to an affinity list 120 and deleting an entry 122
from an affinity list 120.

[0017] The target process parameter 104 specifies
the target process (as identified by its PID) whose affinity
list 120 is the target of the operation specified by the
function code parameter 102.

[0018] The event process parameter 106 has different
uses based upon the function code parameter 102
specified. The event process 106 identifies the
process that is to be delivered the event when the target
process terminates. When an application specifies a
function code 102 to add an entry 122 to an affinity list
120, the contents of this parameter 106 are copied into
an entry 124 in the affinity list 120 of the process specified
by the target process parameter 104. When an application
specifies the function code parameter 102 to
delete an entry 122 from an affinity list 120, the contents
of this parameter 106 are compared with existing entries
124 in the affinity list 120 of the process specified by the
target process parameter 104. If an entry 122 with a
matching process identifier 124 is found, it is cleared
and is available to be reused.

[0019] The event parameter 108 specifies the event
126 to be generated when the target process 104 terminates.
This parameter 108 is unused when the function
code parameter 102 requests deletion of an entry
122. When the function code parameter 102 specifies
adding an entry 122 to an affinity list 120, the contents
of this parameter 108 are copied to an entry 126 in the
affinity list 120 of the process specified by the target
process parameter 104.

[0020] The fifth parameter 110 contains the return
code generated by the pid affinity service. It is used to
indicate the success or failure of the pid affinity service
to the application program.
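
The five parameters just described could be grouped, as a rough sketch, into a single record. The type names, field names and the numeric values of the function codes below are assumptions made for illustration; the description does not specify an encoding. The event field defaults to zero because parameter 108 is unused for a delete request.

```python
from dataclasses import dataclass
from enum import Enum


class FunctionCode(Enum):      # function code 102; numeric values are hypothetical
    ADD = 1
    DELETE = 2


@dataclass
class PidAffinityParams:       # parameter list of Fig. 1
    function: FunctionCode     # 102: add or delete
    target_pid: int            # 104: process whose affinity list is updated
    event_pid: int             # 106: process to be notified
    event: int = 0             # 108: event to generate; unused for DELETE
    return_code: int = 0       # 110: output, filled in by the service


add_request = PidAffinityParams(FunctionCode.ADD, target_pid=200,
                                event_pid=100, event=15)
delete_request = PidAffinityParams(FunctionCode.DELETE, target_pid=200,
                                   event_pid=100)
print(add_request)
print(delete_request)
```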

[0021] Fig. 2 shows the usage of the pid affinity service
221 when a client program adds its PID to the pid
affinity list 120 of a server. Fig. 2 shows user address
spaces 202 and a kernel address space 204. The kernel
address space 204 is where services are provided that
allow applications to communicate with other user address
spaces 202. In this example, user address spaces
202 include a client address space 206 (process A) that
is communicating with a server address space 216
(process B).

[0022] The client 206 initially assigns a work request
to the server 216 (step 208). As discussed earlier, one
means of doing this is by placing a message on a message
queue. After assigning the work request at step
208, the client 206 waits for a response from the server
216 by invoking a wait function 212 in the kernel address
space 204 (step 210). The wait function 212 could be a
general-purpose wait function, or it could be a function
like msgrcv that waits for a message or a signal. This is
standard programming practice on UNIX systems. As
described, for example, in W. R. Stevens, UNIX Network
Programming, 1990, pages 126-137, incorporated
herein by reference, in a UNIX system a process wishing
to send a message to another process may issue a msgsnd
system call to place a message in a message
queue. That other process may in turn issue a msgrcv
system call to retrieve the message from the message
queue.

[0023] In the server space 216, shown as process B,
the server receives the work request from the client 206
(step 218). This could be accomplished using a function
like msgrcv to receive a message placed on a message
queue by the client 206 at step 208. After receiving the
work request at step 218, the server 216 calls the pid
affinity service (pid_affinity) 221 of the present invention
with a function code 102 to add, a target process PID
104 of process B (itself), an event process 106 of process
A, and an event 108 which could be a particular signal
(step 220). Once this step is completed, should anything
happen to terminate the server process 216, the
client process 206 is guaranteed to be notified with the
requested event 108.

[0024] Next, server 216 processes the work request
assigned at step 208 (step 220). Assuming no errors occur,
the server 216 processes the work request (step
222) and then notifies the client 206 of the completion
(step 226). This could be accomplished by sending a
message to the client 206 with the results of the work
request. The msgsnd by the server 216 would wake up
the client 206 in a msgrcv wait 212. After notifying client
process 206 at step 226, the server 216 calls the pid
affinity service 221 with function code 102 to delete an
entry 122 in the PID affinity list 120 for a target process
104 set to process B 216 (itself) (step 228). The event
process 106 is set to process A 206. After the pid affinity
service 221 completes the request, the entry 122 for the
client process 206 is removed from the PID affinity list
120 for the server process 216.
[0025] During this server processing, suppose a terminating
event 224 occurs, which prevents the server
from completing the work request at step 226. In this
case, the kernel 204 gets control in process termination
230. As part of process termination 230, the kernel handles
any entries 122 in the PID affinity list 120 for the
terminating process 216. If an entry in the PID affinity
list 120 is filled in (step 232), then the kernel generates
the event 126 and targets this event to the PID 124 in
the entry 122 of the PID affinity list 120 (step 234).
[0026] The generation of the event at step 234 causes
the target process 206 to be resumed from its wait condition
212 (step 236) and triggers the delivery of the abnormal
event 126 to an event exit 238 of process A 206.
The client code in the event exit 238 is notified of the
termination of the server 216 (process B) from which it
was awaiting a response (step 240). The client event
exit 238 can then decide whether to terminate or retry
the request. What the client does when notified is not
part of the present invention and is therefore not
described.
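
The termination processing of steps 232 and 234, together with delivery to an event exit, might be sketched as follows. This is a simulation under stated assumptions: process_term here is a plain function rather than a kernel service, event exits are modelled as callbacks registered in a dictionary, and the PID 1001 and event number 15 are arbitrary. An unfilled slot is represented by None.

```python
def process_term(pal, generate_event):
    """Walk a terminating process's PAL and generate each requested event."""
    for entry in pal:
        if entry is None:                  # slot not filled in (step 232)
            continue
        notify_pid, event = entry
        generate_event(notify_pid, event)  # step 234


# Hypothetical event exits keyed by PID; each records what it was told.
seen = []
event_exits = {1001: lambda event: seen.append(("process A", event))}


def generate_event(pid, event):
    exit_routine = event_exits.get(pid)
    if exit_routine is not None:           # the process to notify may be gone
        exit_routine(event)


# The server's list holds one filled-in entry for process A and one empty slot.
process_term([(1001, 15), None], generate_event)
print(seen)
```

What the event exit does next (terminate, retry, and so on) is left to the client, as stated above.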

[0027] Fig. 3 shows another model supported by the
pid affinity service 221. In this case, a client process 302
(process C) determines the PID of a server process 320
(process D) with which it will soon communicate. This
may be accomplished with shared memory, configuration
files or other means not related to the present invention.
The client process 302 then calls the pid affinity
service 221 with a function code 102 of add, a target
process 104 PID for process D 320 (the server), an
event process 106 set to process C 302 (the client, itself)
and the event 108 it wishes to receive if the server 320
terminates while processing its request (step 304).
[0028] The client 302 then assigns work to the server
process 320 via a message queue or other communication
mechanism (step 306). The client 302 then calls the
wait service 212 to wait for a response from the server
320 (step 308). The wait service 212 puts the client 302
to sleep until the requested function completes or an abnormal
event is received.

[0029] In the meantime, the server 320 has received
the work request (step 322) and is processing the work
(step 324). If all works successfully, the server 320 notifies
the client 302 when the work completes (step 328).
This notification at step 328 causes the client process
302 to exit the wait function 212 with a successful return
code. Upon receiving control back from wait, the client
302 calls the pid affinity service 221 to undo the call
made at step 304 (step 310). This call at step 310 will
set the function code 102 to request delete, the target
process 104 will identify server process D 320 and the
event process 106 will identify this client 302.
[0030] If a terminating event 326 hits the server 320,
then it will trigger the process termination service
(process_term) 230. Process termination service 230
will run through the PID affinity list 120 for server process
D 320 and for each entry in the PID affinity list (step
232), it will generate 224 the requested event 126 to the
target PID 124 (step 234). In this case, the target PID
124 identifies client process C 302 and the event 126 is
what was passed in the event parameter 108 in step
304.

[0031] When the event is generated at step 234, it
causes client process C 302 to be taken out of the wait
212 with an interrupt (step 340). The wait function 212,
instead of returning to the caller after step 308, now
passes control to the event exit 311. The event exit 311
is notified of the termination of server process D (step
312). At this point, the client code 302 can either terminate,
retry the request or request a different service.
[0032] Fig. 4 shows the processing of the PID affinity
service 221. On entry, the service 221 validates the caller's
parameters (step 402). If the function code 102, target
process PID 104, event process PID 106, or event
108 is invalid, then the service 221 sets a unique failing
return code (step 404) and returns to the caller (step
406). Assuming all parameters are valid, the service 221
obtains a process lock for the target process 104 (step
408). This lock serializes updates to the PID affinity list
120 (hereafter referred to as PAL) of the target process
104 for multiple callers.

[0033] If the target process 104 does not yet have a
PAL 120 (step 410), then storage is obtained for the PAL
120 and the location of the PAL 120 is stored in the Process
Information Block (PIB) 114 in field 116 (step 412).
Next the function code 102 is tested to determine whether
add or delete processing is requested (step 416). If
add processing is requested, processing is as described
in Fig. 5 (step 418).
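
The entry processing of Fig. 4 (validate, lock, create the PAL on first use, then dispatch) can be sketched as below. Everything named here is an assumption for illustration: the return code values, the use of strings for the function code, the rule that a valid event is a positive integer, and the choice to check the event only for add requests. A threading.Lock stands in for the process lock, and the add and delete bodies are deliberately simplified.

```python
import threading

# Hypothetical return codes: one distinct failing value per invalid parameter.
RC_OK, RC_BAD_FUNCTION, RC_BAD_TARGET, RC_BAD_EVENT_PID, RC_BAD_EVENT = range(5)


class Process:
    def __init__(self, pid):
        self.pid = pid
        self.lock = threading.Lock()  # stands in for the process lock (step 408)
        self.pal = None               # no PAL until the first request (step 410)


def pid_affinity(processes, function, target_pid, event_pid, event):
    # Step 402: validate the caller's parameters before taking any lock.
    if function not in ("add", "delete"):
        return RC_BAD_FUNCTION
    if target_pid not in processes:
        return RC_BAD_TARGET
    if event_pid not in processes:
        return RC_BAD_EVENT_PID
    if function == "add" and not (isinstance(event, int) and event > 0):
        return RC_BAD_EVENT           # assumption: events are positive integers
    target = processes[target_pid]
    with target.lock:                 # serialize updates to this target's PAL
        if target.pal is None:        # step 412: obtain storage for the PAL
            target.pal = []
        if function == "add":         # step 416: dispatch on the function code
            target.pal.append((event_pid, event))
        else:
            target.pal = [e for e in target.pal if e[0] != event_pid]
    return RC_OK


table = {100: Process(100), 200: Process(200)}
print(pid_affinity(table, "add", 200, 100, 15), table[200].pal)
```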

[0034] For delete processing, the PAL 120 is scanned
for an entry 122 that has a PID 124 that matches the
event process PID 106 passed as input (step 414). If a
matching entry 122 is found (step 420), then the entry
122 is cleared and the last entry 122 in the PAL 120 is
moved to the cleared entry to keep the table packed
(step 422). The process lock is then released and control
is returned to the caller (step 406). If the entry 122
is not found, then the process lock is released and control
is returned to the caller without performing the deletion
step 422 (step 406).
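
The delete step, in which the last entry is moved into the cleared slot so that the table stays packed, can be illustrated with a short sketch. The function name pal_delete and the representation of the PAL as a Python list of (PID, event) pairs are assumptions; locking is omitted here for brevity.

```python
def pal_delete(pal, event_pid):
    """Remove the entry for event_pid, keeping the list packed.

    Returns True if an entry was removed, False if none matched.
    """
    for i, (pid, _event) in enumerate(pal):
        if pid == event_pid:          # step 420: matching entry found
            last = pal.pop()          # take the last entry off the end
            if i < len(pal):          # the match was not itself the last entry
                pal[i] = last         # step 422: move last entry into the gap
            return True
    return False


entries = [(100, 15), (101, 15), (102, 10)]
pal_delete(entries, 100)
print(entries)
```

Moving the last entry into the gap avoids shifting every later entry, at the cost of not preserving insertion order.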

[0035] Fig. 5 shows the processing to add an entry to
the PAL 120. The target process PID 104 is tested (step
502) to determine if it is the same as the caller's PID
116. If they match, it means that if the calling process
terminates, it will cause a signal (event 108) to be sent
to the event process 106. Before adding the entry 122
to the PAL 120, a test is made to determine if the calling
process is allowed to send a signal (event 108) to the
event process 106 (step 504). If the caller is not permitted
to send the signal (event 108), then the service sets
an error code (step 508), releases the process lock and
returns to the caller (step 518).
[0036] Once past the initial tests, the code loops
through the PAL 120 (step 506). Looking at an entry 122
in the PAL 120, if the current PID 124 is the same as the
event PID 106 (step 510), then this entry 122 is overlaid
by storing the event pid 106 over the PID in the entry
124 and the event 108 over the event 126 in the entry
122 (step 512). If the PIDs don't match at step 510, then
if there are more entries in the PAL 120 (step 514), the
loop continues at step 506.

[0037] If the event PID 106 is not found in the PAL,
then a new entry 122 is chosen. This will normally just
use the next unused entry 122 in the PAL 120. If the PAL
120 is full, a new larger PAL is obtained, the old PAL 120
is copied into the new PAL and the address of the new
PAL is stored in the PIB 114 in field 116. Since the process
is locked (step 408), this can be done safely. After
copying the old PAL to the new PAL, the old PAL is freed.
The new entry is then stored as in step 512 using an
unused entry 122 in the PAL. The process lock is released
and control is returned to the caller (step 518).
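
The add logic of Fig. 5 (overlay an existing entry for the same PID, otherwise use the next unused entry, growing the table when it is full) might look like the following sketch. The class name Pal, the initial capacity and the doubling growth policy are assumptions; the description says only that a larger PAL is obtained. The permission test of step 504 and the process lock are not modelled.

```python
class Pal:
    """Fixed-capacity table of (PID, event) entries that grows when full."""

    def __init__(self, capacity=2):
        self.slots = [None] * capacity
        self.used = 0                      # number of filled-in entries

    def add(self, event_pid, event):
        # Steps 506-514: look for an existing entry with the same PID.
        for i in range(self.used):
            if self.slots[i][0] == event_pid:
                self.slots[i] = (event_pid, event)   # step 512: overlay
                return
        # Not found: if the table is full, obtain a larger one and copy.
        if self.used == len(self.slots):
            bigger = [None] * (2 * len(self.slots))  # doubling is an assumption
            bigger[:self.used] = self.slots
            self.slots = bigger            # old table is dropped ("freed")
        self.slots[self.used] = (event_pid, event)   # next unused entry
        self.used += 1


table = Pal(capacity=2)
for pid in (100, 101, 102):
    table.add(pid, 15)
table.add(101, 10)                         # same PID again: overlaid, not appended
print(table.used, table.slots)
```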
[0038] Although a particular embodiment of the invention
has been shown and described, various modifications
and extensions within the scope of the appended
claims will be apparent to those skilled in the art.



Claims 

1. A method of providing for notification of task termination
in an information handling system having a
plurality of interacting tasks, the method comprising
the steps of:

defining for each of one or more target tasks an
affinity list containing one or more entries for
other tasks that are to be notified on termination
of the target task;

in response to receiving an affinity request
specifying a target task and another task, adding
an entry for the other task to an affinity list
defined for the target task; and

in response to detecting a termination of a target
task, notifying each other task contained in
the affinity list defined for the target task.

2. The method of claim 1 in which the affinity request
originates from the target task.

3. The method of claim 1 in which the affinity request
originates from the other task.

4. The method of claim 1 in which the affinity request
is of a first type, the method comprising the further
step of:

in response to receiving an affinity request of
a second type specifying a target task and another
task, deleting an entry for the other task from the
affinity list defined for the target task.

5. The method of any preceding claim in which the
adding step comprises the steps of:

determining whether an affinity list is already
defined for the target task;

if an affinity list is already defined for the target
task, adding an entry for the other task to the
affinity list defined for the target task; and

if an affinity list is not already defined for the
target task, defining an affinity list for the target
task and adding an entry for the other task to
the affinity list defined for the target task.

6. The method of any preceding claim in which the
tasks are processes having separate address spaces.

7. The method of claim 1 in which the tasks are user
tasks and the steps are performed by an operating
system kernel.

8. The method of claim 1 in which the affinity request
specifies a type of operation to be performed on the
affinity list defined for the target process.

9. The method of claim 1 in which the affinity request
specifies an event to be generated for the other task
upon termination of the target task.

10. Apparatus for providing for the notification of task
termination in an information handling system having
a plurality of interacting tasks, the apparatus
comprising:

means for defining for each of one or more target
tasks an affinity list containing one or more
entries for other tasks that are to be notified on
termination of the target task;

means responsive to receiving an affinity request
specifying a target task and another task
for adding an entry for the other task to an affinity
list defined for the target task; and

means responsive to detecting a termination of
a target task for notifying each other task contained
in the affinity list defined for the target
task.



11. The apparatus of claim 10 in which the affinity request
is of a first type, the apparatus further comprising:

means responsive to receiving an affinity request
of a second type specifying a target task and
another task for deleting an entry for the other task
from the affinity list defined for the target task.

12. The apparatus of claim 1 in which the adding means
comprises:

means for determining whether an affinity list is
already defined for the target task;

means for adding an entry for the other task to
the affinity list defined for the target task if an
affinity list is already defined for the target task;
and

means for defining an affinity list for the target
task and adding an entry for the other task to
the affinity list defined for the target task if an
affinity list is not already defined for the target
task.

13. A computer program element comprising computer
program code means executable by the computer
to:

define for each of one or more target tasks an
affinity list containing one or more entries for
other tasks that are to be notified on termination
of the target task;

in response to receiving an affinity request
specifying a target task and another task, add
an entry for the other task to an affinity list
defined for the target task; and

in response to detecting a termination of a target
task, notify each other task contained in the
affinity list defined for the target task.

14. The computer program element of claim 13 in which
the affinity request is of a first type, further comprising
computer program code means executable by
the computer to:

in response to receiving an affinity request of
a second type specifying a target task and another
task, delete an entry for the other task from the
affinity list defined for the target task.

15. The computer program element of claim 13 or claim
14 embodied on a computer readable medium.